14 research outputs found

    Predicting the Law Area and Decisions of French Supreme Court Cases

    In this paper, we investigate the application of text classification methods to predict the law area and the decision of cases judged by the French Supreme Court. We also investigate the influence of the time period in which a ruling was made on the textual form of the case description, and the extent to which the judge's motivation for a ruling must be masked to emulate a real-world test scenario. Using a linear Support Vector Machine (SVM) classifier trained on lexical features, we report an F1 score of 96% in predicting a case ruling, 90% in predicting the law area of a case, and 75.9% in estimating the time span in which a ruling was issued. Comment: RANLP 201

    Extraction of ontology schema components from financial news

    In this thesis we describe an incremental, multi-layer, rule-based methodology for extracting ontology schema components from German financial newspaper text. By extraction of ontology schema components we mean the detection of new concepts, and of relations between these concepts, for ontology building. This process corresponds to the intensional part of an ontology and is often referred to as ontology learning (in German, "Lernen von Ontologien"). We present both the process of generating rules for the extraction of ontology schema components and the application of the generated rules.

    Measuring post-editing time and effort for different types of machine translation errors

    Post-editing (PE) of machine translation (MT) is becoming increasingly common in professional translation settings. However, many users refuse to employ MT because of the poor quality of its output, and even reject post-editing job offers. This can change if MT quality is improved from the point of view of the PE process. This article investigates different types of MT errors and the difficulties they pose for PE in terms of post-editing time and technical effort. For the experiment we used English-to-German translations produced by MT engines. The errors had previously been annotated using the MQM scheme for error annotation. The sentences were post-edited by translation students. The experiment allowed us to make observations about the relation between technical and temporal PE effort, and to identify the types of errors that are more challenging for PE.

    Improving translation memory matching and retrieval using paraphrases

    This is an accepted manuscript of an article published by Springer Nature in Machine Translation on 02/11/2016, available online: https://doi.org/10.1007/s10590-016-9180-0 The accepted version of the publication may differ from the final published version.
    Most current Translation Memory (TM) systems work at the string level (character or word level) and lack semantic knowledge while matching. They use simple edit distance calculated on the surface form or some variation of it (stem, lemma), which does not take any semantic aspects into consideration in matching. This paper presents a novel and efficient approach to incorporating semantic information, in the form of paraphrasing, into the edit-distance metric. The approach computes edit distance while efficiently considering paraphrases using dynamic programming and greedy approximation. In addition to using automatic evaluation metrics such as BLEU and METEOR, we have carried out an extensive human evaluation in which we measured post-editing time, keystrokes, HTER, and HMETEOR, and carried out three rounds of subjective evaluations. Our results show that paraphrasing substantially improves TM matching and retrieval, resulting in translation performance increases when translators use paraphrase-enhanced TMs.

    DIVULGA.net: internacionalización de la divulgación del conocimiento científico y académico en internet

    The DIVULGA.net project has the specific aim of internationalizing the dissemination of scientific and academic knowledge, in an initiative led by UCM students whose purpose is to bring in students from foreign universities. In this way, synergies can be created that strengthen an online network for disseminating scientific culture, with university students as the main agents.

    Extraktion von Ontologiekomponenten aus Finanznachrichten

    In this thesis we describe an incremental, multi-layer, rule-based methodology for extracting ontology schema components from German financial newspaper text. By extraction of ontology schema components we mean the detection of new concepts, and of relations between these concepts, for ontology building. This process corresponds to the intensional part of an ontology and is often referred to as ontology learning (in German, "Lernen von Ontologien"). We present both the process of generating rules for the extraction of ontology schema components and the application of the generated rules.

    Beyond Linguistic Equivalence. An Empirical Study of Translation Evaluation in a Translation Learner Corpus

    The realisation that fully automatic translation in many settings is still far from producing output that is equal or superior to human translation has led to an intense interest in translation evaluation in the MT community. However, research in this field has so far not only largely ignored the tremendous amount of relevant knowledge available in a closely related discipline, namely translation studies, but also failed to provide a deeper understanding of the nature of "translation errors" and "translation quality". This paper presents an empirical take on the latter concept.